Reinforcement Learning in Partially Observable Markov Decision Processes using Hybrid Probabilistic Logic Programs
نویسنده
چکیده
We present a probabilistic logic programming framework to reinforcement learning, by integrating reinforcement learning, in POMDP environments, with normal hybrid probabilistic logic programs with probabilistic answer set semantics, that is capable of representing domain-specific knowledge. We formally prove the correctness of our approach. We show that the complexity of finding a policy for a reinforcement learning problem in our approach is NP-complete. In addition, we show that any reinforcement learning problem can be encoded as a classical logic program with answer set semantics. We also show that a reinforcement learning problem can be encoded as a SAT problem. We present a new high level action description language that allows the factored representation of POMDP. Moreover, we modify the original model of POMDP so that it be able to distinguish between knowledge producing actions and actions that change the environment.
منابع مشابه
Non Deterministic Logic Programs
Non deterministic applications arise in many domains, including, stochastic optimization, multi-objectives optimization, stochastic planning, contingent stochastic planning, reinforcement learning, reinforcement learning in partially observable Markov decision processes, and conditional planning. We present a logic programming framework called non deterministic logic programs, along with a decl...
متن کاملProbabilistic and Decision-Theoretic User Modeling in the Context of Software Customization
Research in the field of user modeling has aimed to supersede the current “one-size-fits-all” trend in software development that forces users to change their behaviour according to the preprogrammed functions. This paper discusses aspects of user modeling and its relevance to software customization. In particular, we focus on user modeling techniques that utilize probabilistic and decision-theo...
متن کاملSequential Constant Size Compressors for Reinforcement Learning
Traditional Reinforcement Learning methods are insufficient for AGIs who must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Me...
متن کاملTitle:clipp: Combining Logical Inference and Probabilistic Planning
Planning on mobile robots deployed in complex real-world application domains is a challenge because: (a) robots lack knowledge representation and common sense reasoning capabilities; and (b) observations from sensors are unreliable and actions performed by robots are non-deterministic. In this talk, I shall describe a hybrid framework named CLIPP that combines answer set programming (ASP) and h...
متن کاملReinforcement Learning for Problems with Hidden State
In this paper, we describe how techniques from reinforcement learning might be used to approach the problem of acting under uncertainty. We start by introducing the theory of partially observable Markov decision processes (POMDPs) to describe what we call hidden state problems. After a brief review of other POMDP solution techniques, we motivate reinforcement learning by considering an agent wi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1011.5951 شماره
صفحات -
تاریخ انتشار 2010